Hierarchical Linearly-Solvable Markov Decision Problems

نویسندگان

  • Anders Jonsson
  • Vicenç Gómez
چکیده

We present a hierarchical reinforcement learning framework that formulates each task in the hierarchy as a special type of Markov decision process for which the Bellman equation is linear and has analytical solution. Problems of this type, called linearly-solvable MDPs (LMDPs) have interesting properties that can be exploited in a hierarchical setting, such as efficient learning of the optimal value function or task compositionality. The proposed hierarchical approach can also be seen as a novel alternative to solving LMDPs with large state spaces. We derive a hierarchical version of the socalled Z-learning algorithm that learns different tasks simultaneously and show empirically that it significantly outperforms the state-of-the-art learning methods in two classical hierarchical reinforcement learning domains: the taxi domain and an autonomous guided vehicle task.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Hierarchy through Composition with Linearly Solvable Markov Decision Processes

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Current hierarchical frameworks execute actions serially, with macroactions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme uses the concurrent compositionality provided by the l...

متن کامل

A Unifying Framework for Linearly Solvable Control

Recent work has led to the development of an elegant theory of Linearly Solvable Markov Decision Processes (LMDPs) and related Path-Integral Control Problems. Traditionally, LMDPs have been formulated using stochastic policies and a control cost based on the KL divergence. In this paper, we extend this framework to a more general class of divergences: the Rényi divergences. These are a more gen...

متن کامل

Hierarchy Through Composition with Multitask LMDPs

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Most current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme exploits the guaranteed concurrent compositional...

متن کامل

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

متن کامل

Efficient Learning in Linearly Solvable MDP Models

Linearly solvable Markov Decision Process (MDP) models are a powerful subclass of problems with a simple structure that allow the policy to be written directly in terms of the uncontrolled (passive) dynamics of the environment and the goals of the agent. However, there have been no learning algorithms for this class of models. In this research, we develop a robust learning approach to linearly ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016